IN THE CLAIMS 

Please cancel claims 1-16 without prejudice. 
Please add claims 17-35 as follows: 

17. (New) A method for clustering documents, including generating clusters with user 
perspective comprising: 

receiving session logs; 

performing log-based clustering on the session logs to generate session clusters; 
representing each session cluster as a log-based document suitable for content based 
clustering; 

receiving a plurality of documents that includes a first document that was accessed 
in one session and a second document that was not accessed in the sessions; 

replacing the first document with a log-based document associated with the 
session cluster that includes the first document; and 

performing content based clustering on at least the first document and the second 
document to generate clusters with user perspective. 

18. (New) The method of claim 17 wherein representing each session cluster as a 
log-based document suitable for content based clustering includes modifying each document 
referenced in the session cluster so that a Euclidean distance between the documents is the 
same. 
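
The equal-distance condition of claim 18 can be illustrated with a small sketch. It assumes one possible reading, namely that every document referenced in a session cluster is replaced by the same log-based vector, so that all pairwise Euclidean distances within the cluster become equal (here, zero). The helper `euclidean` and the sample vectors are hypothetical and are not taken from the claims.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical term-count vectors for three documents in one session cluster.
original = {"d1": [1, 0, 2], "d2": [0, 3, 1], "d3": [2, 2, 0]}

# Assumed modification: every member is replaced by one shared vector
# (the element-wise sum of the members' vectors).
shared = [sum(col) for col in zip(*original.values())]
modified = {doc_id: shared for doc_id in original}

ids = sorted(modified)
distances = {
    (a, b): euclidean(modified[a], modified[b])
    for i, a in enumerate(ids)
    for b in ids[i + 1:]
}
```

Under this assumption every entry of `distances` is identical, which is the property the claim recites; other modifications that equalize the distances would satisfy the same wording.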

19. (New) The method of claim 17, wherein each session log comprises a query used to 
retrieve documents. 

20. (New) The method of claim 17, wherein each session log comprises a number of 
documents found to satisfy a query. 

21. (New) The method of claim 17, wherein each session log comprises a list of 
documents opened by a user. 

22. (New) The method of claim 17, wherein each session log comprises a length of time 
that a document was opened. 
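
For illustration, the session log contents recited in claims 19 through 22 can be gathered into a single record. The following is a minimal sketch only; the class name `SessionLog`, its field names, and the sample values are assumptions introduced here and do not appear in the claims.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class SessionLog:
    """One retrieval session (hypothetical layout)."""
    query: str                      # claim 19: query used to retrieve documents
    num_results: int                # claim 20: number of documents satisfying the query
    opened_docs: List[str] = field(default_factory=list)          # claim 21
    open_seconds: Dict[str, float] = field(default_factory=dict)  # claim 22

# Illustrative values only.
log = SessionLog(
    query="printer driver install",
    num_results=42,
    opened_docs=["doc_a", "doc_c"],
    open_seconds={"doc_a": 35.0, "doc_c": 120.5},
)
```

Each dependent claim requires only its own field, so a conforming log could carry any subset of these entries.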

23. (New) A method for clustering documents comprising: 

generating a hybrid matrix of vectors comprising a first vector representing a first 
document and a second vector representing a log-based document cluster; and 
clustering the documents using the hybrid matrix. 

24. (New) The method of claim 23 wherein a second vector is used in place of a second 
document within the hybrid matrix wherein the second document forms a portion of the 
log-based document cluster. 

25. (New) The method of claim 23 wherein clustering the documents using the hybrid 
matrix is performed using a content-based clustering technique. 

26. (New) The method recited in claim 23 wherein generating the hybrid matrix 
comprises: 

accessing retrieval session logs; 

clustering retrieval sessions into session clusters; 

generating a log-based document cluster for each session cluster by combining all 
documents opened during any retrieval session of the session cluster; 

generating a log-based document cluster vector for each of the log-based document 
clusters; 

replacing each document in the log-based document cluster with the log-based 
document cluster vector; 

generating an individual document vector for each document not opened during any 
retrieval session; and 

combining the log-based document cluster vector and the individual document cluster 
vector. 
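
The steps of claim 26 can be sketched as follows. This is an illustrative reading, not the applicant's implementation: it assumes plain term-count vectors, assumes that a log-based document cluster vector is formed from the concatenated text of the cluster's documents, and assumes that a document opened in several session clusters keeps the vector of the first one. The names `term_vector` and `build_hybrid_matrix` and the sample data are hypothetical.

```python
from collections import Counter

def term_vector(text, vocab):
    """Term-count vector of `text` over the ordered vocabulary `vocab`."""
    counts = Counter(text.lower().split())
    return [counts[t] for t in vocab]

def build_hybrid_matrix(docs, session_clusters):
    """docs: {doc_id: text}; session_clusters: list of clusters, each a list
    of sessions, each session a set of doc_ids opened in that session."""
    vocab = sorted({t for text in docs.values() for t in text.lower().split()})
    rows = {}
    for cluster in session_clusters:
        # Log-based document cluster: all documents opened in any session.
        members = sorted(set().union(*cluster))
        cluster_vec = term_vector(" ".join(docs[d] for d in members), vocab)
        for d in members:
            # Replace each member document with the cluster vector.
            rows.setdefault(d, cluster_vec)
    for d, text in docs.items():
        if d not in rows:
            # Individual vector for documents never opened in any session.
            rows[d] = term_vector(text, vocab)
    return vocab, rows

docs = {
    "d1": "install printer driver",
    "d2": "printer offline fix",
    "d3": "reset password",
}
session_clusters = [[{"d1"}, {"d1", "d2"}]]
vocab, rows = build_hybrid_matrix(docs, session_clusters)
```

In this sketch `rows["d1"]` and `rows["d2"]` share one log-based vector, while `rows["d3"]` keeps its own content vector; the resulting rows could then be passed to any content-based clustering technique, as in claim 25.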

27. (New) The method of claim 26 wherein the step of clustering retrieval sessions into 
session clusters comprises the steps of: 

generating a Boolean session vector for each retrieval session; 
forming a matrix of the Boolean session vectors; and 

applying a clustering algorithm to the matrix of the Boolean session vectors. 
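
The three steps of claim 27 can be sketched as below. The claim does not name a particular clustering algorithm; the greedy threshold grouping on Jaccard similarity used here is a stand-in chosen for brevity, and the function names, the threshold of 0.5, and the sample sessions are all assumptions.

```python
def boolean_session_vectors(sessions, doc_ids):
    """One Boolean row per session: 1 if the document was opened, else 0."""
    return [[1 if d in opened else 0 for d in doc_ids] for opened in sessions]

def jaccard(u, v):
    """Jaccard similarity of two Boolean vectors."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0

def cluster_sessions(matrix, threshold=0.5):
    """Greedy grouping: a row joins the first cluster containing a row whose
    similarity to it meets the threshold; otherwise it starts a new cluster."""
    clusters = []
    for i, row in enumerate(matrix):
        for cluster in clusters:
            if any(jaccard(row, matrix[j]) >= threshold for j in cluster):
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters

doc_ids = ["d1", "d2", "d3", "d4"]
sessions = [{"d1", "d2"}, {"d1", "d2", "d3"}, {"d4"}]
matrix = boolean_session_vectors(sessions, doc_ids)
session_clusters = cluster_sessions(matrix)
```

Here the first two sessions share two of three opened documents and fall into one session cluster, while the third session forms its own.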

28. (New) A system for clustering documents, the system comprising: 
a storage for storing retrieval session logs; and 

a processor connected to the storage, configured to cluster the retrieval sessions into 
session clusters, generate, for each session cluster, a log-based document cluster, generate a 
log-based document cluster vector for each of the log-based document clusters, generate an 
individual document vector for each document not opened during any retrieval session, and 
cluster the documents using the log-based document cluster vectors and individual document 
vectors. 

29. (New) The system of claim 28 wherein the documents are stored in the storage. 

30. (New) The system of claim 28 further comprising: 

a memory connected to the processor, for storage of a hybrid matrix comprising the 
log-based document cluster vectors and the individual document vectors. 

31. (New) A data processing system having session logs and documents, the system 
comprising: 

a processor for executing program instructions; and 

a media readable by the processor having a document clustering module having a 
plurality of instructions, that when executed by the processor, performs log-based clustering 
on the session logs to generate session clusters, converts the session clusters into a form 
suitable for content-based clustering, and performs content-based clustering on the documents 
and session clusters in a form suitable for content-based clustering to generate document 
clusters with users' perspective. 

32. (New) The system of claim 31 wherein the document clustering module further 
comprises: 

a session vector generation module for receiving the session logs and based thereon 
for generating a session vector for each session log; 

a session cluster generation module coupled to the session vector generation module 
for receiving the session vectors and based thereon for generating session clusters; 

a hybrid matrix builder for receiving the documents, coupled to the session cluster 
generation module, for receiving the session clusters and based thereon for generating a 
hybrid matrix having at least one log-based document; and 

a topic generation module coupled to the hybrid matrix builder for receiving the 
hybrid matrix and based thereon for generating document clusters with users' perspective. 

33. (New) The system of claim 32 wherein the hybrid matrix builder further comprises: 

a session document generation module for receiving session clusters and based 
thereon for generating super documents; and 

a document modification module coupled to the session document generation module 
for receiving the super documents, for receiving the documents, and based thereon for 
generating the hybrid matrix. 

34. (New) The system of claim 31 wherein the media is one of a floppy disk, a compact 
disc, a volatile memory, and a non-volatile memory. 

35. (New) A machine readable memory device encoded with a data structure for 
clustering documents, the data structure having entries for a log-based document cluster 
vector generated from a log-based document cluster, and an individual document vector 
corresponding to a vector generated from a first document, the first document not belonging 
to any log based document cluster. 